Feature: PhotoBuilder — AI image generation for content pages #91
Merged
… unit tests
New vertical for AI image generation matching web page content:
- Domain: PhotoSession/PhotoImage entities, enums, PhotoBuilderService with IMAGE_COUNT constant
- Infrastructure: PromptGenerator (NeuronAI agent with deliver_image_prompt tool), ImageGenerator (OpenAI Images API), GeneratedImageStorage, Messenger messages/handlers
- Tests: 45 unit tests covering entities, service logic, storage, and image generator
Co-authored-by: Cursor <cursoragent@cursor.com>
…translations
- PhotoBuilderController with all API endpoints (create session, poll, regenerate, serve image, upload to media store)
- Twig template with loading state, user prompt, responsive image grid, media store sidebar
- Two Stimulus controllers: photo_builder_controller.ts (orchestrator) and photo_image_controller.ts (per-card state management)
- EN+DE translations for all PhotoBuilder UI strings
- ImagePromptResultDto to replace associative arrays at boundaries
- Registered new controllers in bootstrap.ts and asset_mapper.yaml
- Service wiring in services.yaml, Twig namespace in twig.yaml
- All quality checks pass (PHPStan, ESLint, tsc, Prettier, PHP CS Fixer)
Co-authored-by: Cursor <cursoragent@cursor.com>
…s for PhotoBuilder
- Wire PhotoBuilder CTA (camera icon) into dist_files_controller for each page
- Add prefillMessage support to chat-based-content-editor controller for the "Embed generated images into content page" flow
- Register PhotoBuilder entities in doctrine.yaml and generate migration for photo_sessions and photo_images tables
- Add Vitest tests for photo_builder_controller (23 tests) and photo_image_controller (25 tests)
- Add tests for PhotoBuilder CTA in dist_files_controller (5 tests) and prefillMessage in chat_based_content_editor_controller (3 tests)
Co-authored-by: Cursor <cursoragent@cursor.com>
…ms, image serving
- Replace invalid placeholder strings (___SESSION_ID___) in Twig template
with dummy UUIDs that satisfy Symfony route parameter requirements
- Use output_format instead of response_format for gpt-image-1 API
(response_format is a dall-e-2/dall-e-3 parameter)
- Generate image URLs via Symfony router to include locale prefix,
fixing broken image display due to missing /{_locale}/ in path
- Update vertical-wiring.md with PhotoBuilder facade dependencies
- Update corresponding unit and frontend tests
Co-authored-by: Cursor <cursoragent@cursor.com>
…dback, TestHarness
- Use etfswui-* styleguide classes on PhotoBuilder page (buttons, cards, forms)
- Add cursor-pointer to all CTAs via styleguide button classes
- Extract Remote Assets sidebar to @common.presentation/_remote_asset_browser_sidebar.html.twig
- Include shared partial in chat_based_content_editor and photo_builder
- Show 'Upload has been finished' banner on PhotoBuilder when upload completes (auto-hide 5s)
- Add PhotoBuilder TestHarness: FakePromptGenerator, FakeImageGenerator, env toggles
- PHOTO_BUILDER_SIMULATE_IMAGE_PROMPT_GENERATION and PHOTO_BUILDER_SIMULATE_IMAGE_GENERATION in .env
- Fix OpenAI image API (output_format for gpt-image-1), poll image URLs, Stimulus action wiring
- IMAGE_COUNT=1 for faster testing; docs/frontendbook.md and vertical-wiring.md updates
Co-authored-by: Cursor <cursoragent@cursor.com>
…er query params
- Show 'Upload has been finished' banner when image-card upload succeeds (not only sidebar)
- Regenerate prompts: overlay + spinner, clear unprotected prompt textareas on start
- Hide overlay when poll returns generating state; add regenerating_prompts translation
- Language switcher: preserve query string (page, conversationId) when switching locale on photo builder
Co-authored-by: Cursor <cursoragent@cursor.com>
- Add uploadedToMediaStoreAt to PhotoImage to track S3 uploads
- Persist upload state in uploadToMediaStore endpoint; idempotent when already uploaded
- Include uploadedToMediaStore in poll response
- Change embedIntoPage to async: upload non-uploaded images first, show 'Uploading images, please wait...' overlay, then navigate on success
- Add translations for uploading_images (EN/DE)
- Reset uploadedToMediaStoreAt when image is regenerated
Co-authored-by: Cursor <cursoragent@cursor.com>
…prompt
- Add uploadedFileName to PhotoImage for hash-prefixed S3 names in embed message
- Pass keepImageIds from regenerate prompts to handler; skip regenerating kept images
- Only dispatch image generation for changed prompts, not kept ones
- Clear uploaded state when prompt is regenerated
Co-authored-by: Cursor <cursoragent@cursor.com>
Dispatch clearPromptIfNotKept event on each child card element instead of the parent — DOM events bubble upward, so dispatching on the parent never reached child controllers. Also show "Generating..." text with pulse animation immediately on non-kept prompts, disable buttons during regeneration, and document the parent-to-child event pattern in frontendbook. Co-authored-by: Cursor <cursoragent@cursor.com>
…ider configuration
Introduce a two-tier LLM configuration system: content editing (OpenAI-only) and PhotoBuilder (OpenAI or Google Gemini). Projects can either reuse content editing settings for image generation or configure a dedicated provider/key.
- Rename llmApiKey/llmModelProvider to contentEditing* scope across entity, DTOs, facades, controllers, templates, and tests
- Add nullable photoBuilder* LLM fields with fallback to content editing
- Extend LlmModelProvider enum with Google case and model selection methods
- Extend LlmModelName enum with gpt-image-1, gemini-3-pro-preview, gemini-3-pro-image-preview
- Implement GeminiImageGenerator adapter and ImageGeneratorFactory
- Parameterize ImagePromptAgent to support both OpenAI and Gemini providers
- Add Google API key verification via Gemini models endpoint
- Add PhotoBuilder LLM settings UI (Option A: reuse / Option B: dedicated) with provider selection, key input, verification, and one-click reuse
- Display active provider and model names on PhotoBuilder page
- Add docs/llm-usage-book.md documenting all LLM concerns and configuration
Co-authored-by: Cursor <cursoragent@cursor.com>
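The fallback from PhotoBuilder settings to content editing settings could look roughly like the sketch below. It is a TypeScript illustration of the rule, not the PHP code from this PR. The `ProjectLlmSettings` shape, the exact field names, and `resolvePhotoBuilderLlm` are assumptions derived from the `contentEditing*` / `photoBuilder*` naming in the commit message.

```typescript
// Hypothetical shapes: only the contentEditing*/photoBuilder* prefixes come from the commit.
type LlmModelProvider = "openai" | "google";

interface ProjectLlmSettings {
  // Content editing is OpenAI-only according to the commit message.
  contentEditingLlmModelProvider: "openai";
  contentEditingLlmApiKey: string;
  // Nullable: null is taken to mean "reuse the content editing settings".
  photoBuilderLlmModelProvider: LlmModelProvider | null;
  photoBuilderLlmApiKey: string | null;
}

interface EffectiveLlm {
  provider: LlmModelProvider;
  apiKey: string;
}

function resolvePhotoBuilderLlm(settings: ProjectLlmSettings): EffectiveLlm {
  // Assumption: dedicated settings apply only when both provider and key are set.
  if (
    settings.photoBuilderLlmModelProvider !== null &&
    settings.photoBuilderLlmApiKey !== null
  ) {
    return {
      provider: settings.photoBuilderLlmModelProvider,
      apiKey: settings.photoBuilderLlmApiKey,
    };
  }
  // Otherwise fall back to the content editing configuration.
  return {
    provider: settings.contentEditingLlmModelProvider,
    apiKey: settings.contentEditingLlmApiKey,
  };
}
```

Resolving the effective provider in one place keeps the reuse and dedicated options behind a single call, so callers never need to know which option the project picked.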
The Stimulus controller searched for the provider radio only within its own element, missing sibling radios in the same fieldset. This caused Google Gemini keys to be verified against OpenAI, always failing. Widen the lookup scope to the closest fieldset/form ancestor. Co-authored-by: Cursor <cursoragent@cursor.com>
…nly)
Lo-res mode (1K, default) enables faster iteration; hi-res mode (2K) produces higher quality output. The toggle is only shown when the effective PhotoBuilder provider is Google Gemini, since OpenAI always generates 1024x1024. Switching modes re-generates all images client-side using current prompts at the new resolution without a page reload.
Co-authored-by: Cursor <cursoragent@cursor.com>
Remove fixed container_name to allow scaling, add deploy.replicas: 5. Co-authored-by: Cursor <cursoragent@cursor.com>
After uploading images to S3 via the "Embed into page" action, poll
the remote asset manifests until all uploaded filenames are confirmed
available before redirecting. This prevents the content editor from
referencing images that haven't propagated to the CDN yet.
- Add findAvailableFileNames() to RemoteContentAssetsFacade (basename
matching against merged manifests) so the logic stays in the
RemoteContentAssets vertical
- Add thin POST endpoint in PhotoBuilderController that delegates to
the facade and returns { available, allAvailable }
- Frontend polls every 3s for up to 90s, showing a spinner overlay
- Includes PHP unit tests, frontend tests, and EN/DE translations
Co-authored-by: Cursor <cursoragent@cursor.com>
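The manifest polling loop described in this commit (every 3s, up to 90s) might be sketched as follows. The `pollUntilAvailable` helper and its `check` and `sleep` parameters are hypothetical; `check` stands in for the POST request to the availability endpoint, and only the `{ available, allAvailable }` response shape comes from the commit message.

```typescript
// Response shape named in the commit message.
interface AvailabilityResult {
  available: string[];
  allAvailable: boolean;
}

// Hypothetical helper: resolves true once all files are available,
// false when the time budget is spent. Elapsed time is approximated
// by summing the sleep intervals, not measured by wall clock.
async function pollUntilAvailable(
  check: () => Promise<AvailabilityResult>,
  intervalMs: number = 3000,
  timeoutMs: number = 90000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  let waitedMs = 0;
  while (true) {
    const result = await check();
    if (result.allAvailable) {
      return true;
    }
    if (waitedMs >= timeoutMs) {
      return false; // give up; the caller decides how to report the timeout
    }
    await sleep(intervalMs);
    waitedMs += intervalMs;
  }
}
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets tests run the loop without real delays.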
Use the faster and cheaper Flash model for generating image prompts in PhotoBuilder when the Google provider is selected. Pro remains the main text model for content editing. Co-authored-by: Cursor <cursoragent@cursor.com>
NeuronAI's Gemini provider only checks parts[0] for functionCall, but Gemini 3 models now return text/thought parts before functionCall parts, causing all tool calls to be silently missed (0 prompts). Introduce PatchedGemini provider that scans all parts and reindexes the tools array after filtering out non-functionCall parts. Co-authored-by: Cursor <cursoragent@cursor.com>
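The difference between checking only `parts[0]` and scanning all parts can be illustrated with a small sketch. This is TypeScript for illustration only; the actual PatchedGemini provider is PHP, and the `GeminiPart` and `ToolCall` types here are simplified assumptions about the response shape. In PHP, filtering an array keeps the original keys, which is presumably why the commit mentions reindexing.

```typescript
// Simplified shape of a Gemini response part; real responses carry more fields.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface GeminiPart {
  text?: string;
  thought?: boolean;
  functionCall?: ToolCall;
}

// Checking only the first part misses calls that follow text/thought parts.
function firstPartOnly(parts: GeminiPart[]): ToolCall[] {
  const first = parts.length > 0 ? parts[0].functionCall : undefined;
  return first ? [first] : [];
}

// Scanning every part collects all function calls into a densely indexed array.
function scanAllParts(parts: GeminiPart[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const part of parts) {
    if (part.functionCall) {
      calls.push(part.functionCall);
    }
  }
  return calls;
}
```

With a response whose first part is a thought, `firstPartOnly` returns an empty list while `scanAllParts` returns every `deliver_image_prompt` call.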
…from PhotoBuilder
- Add in-progress and success feedback near single-image Upload CTA (spinner + 'Uploading…', then checkmark + 'Uploaded'; dispatch uploadComplete/uploadFailed to card)
- Fix upload/success spans always visible: use wrapper spans so only 'hidden' is toggled (no inline-flex vs hidden conflict)
- Find card from event target for reliable completion/failure delivery
- Add translations: uploading_to_media_store, uploaded_to_media_store
- Stack Regenerate and Upload buttons vertically (flex-col) to fit space
Co-authored-by: Cursor <cursoragent@cursor.com>
…ons textarea
- Track lastAppliedUserPrompt; only apply server userPrompt from poll when current value matches it (or first load) so local edits are not overwritten
- Add unit test: user edit preserved when poll runs with textarea unfocused
Co-authored-by: Cursor <cursoragent@cursor.com>
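The rule in this commit can be sketched as a small state holder. The `UserPromptSync` class and its `applyFromPoll` method are hypothetical names; in the PR the logic lives in a Stimulus controller, and only `lastAppliedUserPrompt` is taken from the commit message.

```typescript
// Hypothetical helper, detached from the DOM so the rule is easy to test.
class UserPromptSync {
  // Last server value written into the textarea; null before the first poll.
  private lastAppliedUserPrompt: string | null = null;

  // Returns the value the textarea should hold after a poll response.
  applyFromPoll(currentValue: string, serverPrompt: string): string {
    const firstLoad = this.lastAppliedUserPrompt === null;
    const untouched = currentValue === this.lastAppliedUserPrompt;
    if (firstLoad || untouched) {
      this.lastAppliedUserPrompt = serverPrompt;
      return serverPrompt;
    }
    // The user has edited the textarea since the last applied value: keep the edit.
    return currentValue;
  }
}
```

Comparing against the last applied value, rather than checking focus, is what protects an edit even when the textarea is unfocused during a poll.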
The change-detection optimization skipped dispatching stateChanged to children when per-image data was unchanged, but children rely on that event to re-read the parent's data-photo-builder-generating attribute and enable/disable their Regenerate and Upload buttons. Now tracks anyGenerating transitions and force-dispatches to all cards when it changes. Co-authored-by: Cursor <cursoragent@cursor.com>
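A minimal sketch of the fix, assuming a tracker that decides which cards receive `stateChanged`. The `StateChangeTracker` class, the `ImageState` shape, and the use of serialized JSON for change detection are illustrative assumptions, not the controller's actual code.

```typescript
// Hypothetical per-image poll data; the real payload has more fields.
interface ImageState {
  id: string;
  status: string;
}

class StateChangeTracker {
  private lastPerImage = new Map<string, string>();
  private lastAnyGenerating: boolean | null = null;

  // Returns the ids of the cards that should receive a stateChanged event.
  cardsToNotify(images: ImageState[], anyGenerating: boolean): string[] {
    // A transition of the global flag forces a dispatch to every card,
    // because cards derive their button state from it.
    const globalChanged = anyGenerating !== this.lastAnyGenerating;
    this.lastAnyGenerating = anyGenerating;

    const notify: string[] = [];
    for (const image of images) {
      const serialized = JSON.stringify(image);
      const changed = this.lastPerImage.get(image.id) !== serialized;
      this.lastPerImage.set(image.id, serialized);
      if (changed || globalChanged) {
        notify.push(image.id);
      }
    }
    return notify;
  }
}
```

The per-image optimization still applies while the global flag is stable; only the transition itself bypasses it.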
…uster
The cache-buster is only applied when img.src is actually set (after a regeneration cycle clears lastSetImageUrl), so repeated polls with the same URL still skip re-assignment and avoid redundant fetches.
Co-authored-by: Cursor <cursoragent@cursor.com>
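The behavior described here might be sketched as below. `ImageUrlApplier`, the `ImageLike` stand-in for an `<img>` element, and the `t` query parameter name are assumptions made for illustration; only `lastSetImageUrl` appears in the commit message.

```typescript
// Minimal stand-in for an <img> element so the sketch runs without a DOM.
interface ImageLike {
  src: string;
}

class ImageUrlApplier {
  private lastSetImageUrl: string | null = null;
  private readonly img: ImageLike;
  private readonly now: () => number;

  constructor(img: ImageLike, now: () => number = () => Date.now()) {
    this.img = img;
    this.now = now;
  }

  // Called when a regeneration cycle starts, so the next URL is applied again.
  onRegenerate(): void {
    this.lastSetImageUrl = null;
  }

  // Called on every poll; returns true only when img.src was re-assigned.
  applyFromPoll(imageUrl: string): boolean {
    if (imageUrl === this.lastSetImageUrl) {
      return false; // same URL as last time: skip to avoid a redundant fetch
    }
    this.lastSetImageUrl = imageUrl;
    const separator = imageUrl.indexOf("?") === -1 ? "?" : "&";
    // Hypothetical cache-buster parameter name.
    this.img.src = imageUrl + separator + "t=" + this.now();
    return true;
  }
}
```

Because the comparison uses the URL without the cache-buster, the timestamp never causes a re-assignment by itself.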
…poll
The promptAwaitingRegenerate logic only accepted new prompts when status was "pending" or "generating". If image generation completed within one poll cycle, status was already "completed" and the condition never matched, leaving the textarea permanently stuck. Now compares the incoming prompt against the saved pre-regeneration prompt instead of checking status, correctly handling both fast completions and stale old-data polls.
Co-authored-by: Cursor <cursoragent@cursor.com>
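One way to picture the new comparison is the sketch below. `PromptRegenerateGuard` and its method names are hypothetical, and the sketch covers only the decision described in the commit message: ignore polls that still carry the old prompt, accept the first poll that carries a different one, regardless of status.

```typescript
class PromptRegenerateGuard {
  // Prompt text saved when regeneration was requested; null when not waiting.
  private preRegenerationPrompt: string | null = null;

  startRegeneration(currentPrompt: string): void {
    this.preRegenerationPrompt = currentPrompt;
  }

  isAwaiting(): boolean {
    return this.preRegenerationPrompt !== null;
  }

  // Returns the prompt to show, or null when this poll should be ignored.
  resolveAwaitedPrompt(incomingPrompt: string): string | null {
    if (this.preRegenerationPrompt === null) {
      return null; // nothing is awaited
    }
    if (incomingPrompt === this.preRegenerationPrompt) {
      return null; // stale poll that still carries the old prompt
    }
    // A different prompt arrived: accept it whatever the image status is.
    this.preRegenerationPrompt = null;
    return incomingPrompt;
  }
}
```

A status-based check would have rejected a poll that already reports "completed"; the prompt comparison accepts it as long as the text differs from the saved one.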
- Redesign Preview Pages and PhotoBuilder CTA as distinct styleguide cards
- Add translatable Edit HTML / Preview labels with proper icons
- Make filenames clickable links to preview URLs
- Move AI model info into Behind the Scenes section
- Translate embed prefill message for German locale
- Right-align Content Editor action buttons, use styleguide classes
- Use etfswui-card-back-link for PhotoBuilder back link
- Remove unused flex-row sidebar layout so content uses full width
Co-authored-by: Cursor <cursoragent@cursor.com>
Closes #90.
PhotoBuilder — AI Image Generation for Content Pages
Summary
New vertical src/PhotoBuilder/ that generates AI-driven images matching the visual tone and content of a web page. Users launch PhotoBuilder from the Content Editor, receive AI-generated image prompts based on the page HTML, review/edit prompts, generate images, upload them to the S3 media store, and embed them back into the page, all within one integrated workflow.
Architecture
Vertical slice in src/PhotoBuilder/ following the established Domain / Infrastructure / Presentation layering, communicating with other verticals exclusively via facades.
graph LR
  PhotoBuilder -->|"readWorkspaceFile (dist HTML)"| WorkspaceMgmt
  PhotoBuilder -->|"getProjectInfo (LLM config, S3)"| ProjectMgmt
  PhotoBuilder -->|"uploadAsset (S3)"| RemoteContentAssets
  PhotoBuilder -->|"findAvailableFileNames (manifest polling)"| RemoteContentAssets
  PhotoBuilder -->|"getAccountInfoByEmail"| Account
  ChatBasedContentEditor -.->|"CTA link in dist files"| PhotoBuilder
Facade dependencies (documented in docs/vertical-wiring.md):
- readWorkspaceFile (page HTML for prompt context)
- getProjectInfo (LLM API keys, S3 credentials, provider config)
- uploadAsset, findAvailableFileNames (S3 upload + CDN manifest polling)
- getAccountInfoByEmail (access validation)
Vertical Structure
Domain Layer
Entities
PhotoSession tracks one photo generation session per page:
- id (UUID), workspaceId, conversationId, pagePath
- systemPrompt, userPrompt (LLM prompt context)
- status (enum: generating_prompts, prompts_ready, generating_images, images_ready, failed)
- createdAt
PhotoImage tracks each generated image:
- id (UUID), session (ManyToOne → PhotoSession), position
- prompt, suggestedFileName (LLM-generated)
- status (enum: pending, generating, completed, failed)
- storagePath (relative path in var/photo-builder/)
- uploadedToMediaStoreAt, uploadedFileName (S3 upload tracking)
- errorMessage
Service
PhotoBuilderService orchestrates session lifecycle: creates sessions with IMAGE_COUNT empty image slots, updates prompts from LLM output, coordinates status transitions, and respects "keep" flags during prompt regeneration.
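The slot creation and "keep" handling described above could be sketched like this. It is a TypeScript illustration, not the PHP service. The `createEmptySlots` and `applyPrompts` functions, the id format, and the assumption that new prompts are matched to slots by position are all hypothetical.

```typescript
type ImageStatus = "pending" | "generating" | "completed" | "failed";

// Reduced version of the PhotoImage entity, limited to what the sketch needs.
interface PhotoImageSlot {
  id: string;
  position: number;
  prompt: string | null;
  status: ImageStatus;
}

// Creates one empty slot per image; imageCount plays the role of IMAGE_COUNT.
function createEmptySlots(imageCount: number): PhotoImageSlot[] {
  const slots: PhotoImageSlot[] = [];
  for (let position = 0; position < imageCount; position++) {
    slots.push({ id: "image-" + position, position, prompt: null, status: "pending" });
  }
  return slots;
}

// Applies LLM prompts to all slots except kept ones.
// Assumption: prompts are matched to slots by position.
// Returns the ids whose prompt changed, i.e. the images that need generation.
function applyPrompts(
  slots: PhotoImageSlot[],
  prompts: string[],
  keepImageIds: string[],
): string[] {
  const changedIds: string[] = [];
  for (const slot of slots) {
    const kept = keepImageIds.indexOf(slot.id) !== -1;
    if (kept || slot.position >= prompts.length) {
      continue; // kept images keep their prompt and are not regenerated
    }
    slot.prompt = prompts[slot.position];
    slot.status = "pending";
    changedIds.push(slot.id);
  }
  return changedIds;
}
```

Returning only the changed ids mirrors the idea of dispatching image generation for changed prompts and skipping kept ones.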
Infrastructure Layer
Multi-Provider LLM Support (OpenAI + Google Gemini)
The plan originally assumed OpenAI only. The implementation introduces a two-tier, multi-provider LLM configuration:
Prompt generation uses a NeuronAI Agent with a deliver_image_prompt tool: the LLM calls this tool once per image, delivering structured {prompt, file_name} pairs. This tool-based approach avoids fragile JSON parsing. The agent supports both OpenAI and Gemini providers, parameterized via ImagePromptAgent.
Image generation uses direct HTTP calls:
- OpenAiImageGenerator: OpenAI Images API (gpt-image-1, b64_json response format)
- GeminiImageGenerator: Google Gemini API with native image generation (supports lo-res 1024px and hi-res 2048px modes)
- ImageGeneratorFactory: selects the appropriate generator based on project configuration
PatchedGemini provider: NeuronAI's built-in Gemini provider only checks parts[0] for function calls, but Gemini 3 models return text/thought parts before function call parts. PatchedGemini scans all parts and reindexes correctly.
Async Processing (Symfony Messenger)
Two message/handler pairs, dispatched through the immediate transport:
Messenger consumer scaled to 5 replicas (docker-compose.yml) for parallel image processing.
Image Storage
GeneratedImageStorage: filesystem adapter at var/photo-builder/{sessionId}/{position}.png with save/read/getAbsolutePath methods.
Presentation Layer
Controller
PhotoBuilderController with routes for:
GET /photo-builder/{workspaceId})
Access control via #[IsGranted('ROLE_USER')] with workspace/project ownership verification.
Frontend (Stimulus + Twig)
Two-controller architecture:
- photo_builder_controller.ts: page orchestrator managing session lifecycle, polling, global state, and inter-controller coordination
- photo_image_controller.ts: per-card controller for individual image state, prompt editing, and UI feedback
Key UX features implemented beyond the original plan:
docs/frontendbook.md)
Template
photo_builder.twig: responsive image grid with loading overlay, user prompt section, per-image cards (preview, prompt textarea, keep checkbox, regenerate/upload buttons), and embed CTA. Uses etfswui-* styleguide classes throughout.
Content Editor Integration
- dist_files_controller.ts: camera icon next to each page file that navigates to PhotoBuilder
- ?prefill=Embed images a.jpg, b.jpg into page x.html query param, pre-filling the instruction textarea
- prefillMessage from URL and populates the input
Project Settings: Hierarchical LLM Configuration
The existing single llmApiKey / llmModelProvider fields were renamed to contentEditingLlmApiKey / contentEditingLlmModelProvider (scoped to content editing). New optional photoBuilderLlm* fields were added with automatic fallback to content editing settings.
Project settings UI (project_form.twig) extended with:
Model selection:
- gpt-image-1 for image generation
- gemini-3-pro-image-preview for image generation, gemini-3-flash-preview for prompt generation
TestHarness
src/PhotoBuilder/TestHarness/ provides fake adapters for local development:
- FakePromptGenerator: returns canned prompts without calling an LLM
- FakeImageGenerator: generates placeholder images without API calls
- .env flags: PHOTO_BUILDER_SIMULATE_IMAGE_PROMPT_GENERATION, PHOTO_BUILDER_SIMULATE_IMAGE_GENERATION
Database Migrations
4 migrations:
- Version20260210112717: create photo_sessions and photo_images tables
- Version20260211081136: add uploaded_to_media_store_at to photo_images
- Version20260211082223: add uploaded_file_name to photo_images
- Version20260211110000: rename LLM fields to scoped names, add PhotoBuilder-specific LLM columns
Cross-Cutting Concerns
- #[IsGranted('ROLE_USER')] + workspace/project ownership verification
- new DateTimeImmutable())
- page and conversationId query params when switching locale
Documentation
- docs/vertical-wiring.md updated with PhotoBuilder facade dependencies
- docs/frontendbook.md updated with parent-to-child event dispatch pattern
- docs/llm-usage-book.md added: documents all LLM concerns and provider configuration
Test Coverage
PHP unit tests (in tests/Unit/PhotoBuilder/):
Frontend tests (Vitest, in tests/frontend/unit/PhotoBuilder/):
- photo_builder_controller.test.ts: session lifecycle, polling, prompt regeneration, upload flow, manifest polling, resolution toggle
- photo_image_controller.test.ts: state updates, prompt editing, keep checkbox, button state management, cache-busting
- dist_files_controller.test.ts (PhotoBuilder CTA) and chat_based_content_editor_controller.test.ts (prefill message)
Stats: 78 files changed, ~8,900 lines added, ~340 lines removed.
Made with Cursor